Establishing Traceability Links between Unit Test Cases and Units under Test (2009)

motivation

In a context of legacy systems and reengineering, tests are valuable as a source of up-to-date documentation [11]. Tests can be perceived as examples how to use part of a system

Composing tests from examples (2007)

However, there is currently no way to jump quickly from an implementation to its test code

As such, developers can have a hard time finding the right tests for a programming task.

Eclipse can remember the test-to-code correspondence for tests defined through Eclipse itself, but that cannot possibly cover every test...

The Eclipse Java environment suggests, when creating a new test case via a wizard, to provide the corresponding unit(s) under test. The link is then documented by means of a Javadoc annotation. Unfortunately, developers are free not to use this wizard, let alone completing the unit under test. Eclipse also offers a "referring tests" feature that provides a list of (JUnit) test cases that statically 'use' a production class.

hr.icon

What is it?

A hypothesis that test-to-code traceability can be recovered from the following six sources of data

(i) a test case naming convention;

(ii) explicit fixture declaration;

(iii) static test call graphs;

(iv) run time traces;

(v) lexical analysis; and

(vi) co-evolution logs are all viable means to

Based on this, the authors developed several methods for computing test-to-code traceability and compared their accuracy

Naming Convention (NC)

Possibly the same kind of approach as Detecting the methods under test in java (2005)?

For example, for testMethod1 the method under test would be method1

Tests that do not follow the naming convention obviously cannot be resolved, and only one method can ever be a candidate
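The naming-convention idea above can be sketched roughly as follows. This is only an illustration of the note's testMethod1 → method1 example, not the paper's implementation; the function name `method_under_test` and the `test` prefix pattern are assumptions.

```python
import re


def method_under_test(test_name, production_methods):
    """Return the production method implied by a test name, or None.

    Hypothetical convention: a test is named "test" + the method name,
    e.g. testMethod1 -> method1 (an optional underscore is tolerated).
    """
    match = re.fullmatch(r"test_?(\w+)", test_name)
    if match is None:
        return None  # the test does not follow the convention
    stem = match.group(1)
    candidate = stem[0].lower() + stem[1:]
    # At most one candidate is ever produced, and only if it really exists.
    return candidate if candidate in production_methods else None
```

A test such as `checkBalance` yields `None`, which is the applicability limit mentioned above.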

Fixture Element Types (FET)

I did not really understand this one

Static Call Graph (SCG)

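Collect the classes referenced from within the test

Treat the most frequently referenced one as the class under test

Helpers and similar classes tend to be picked up by mistake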

The drawback of this approach, therefore, is the potential large set of helper and data object types that will be included into the unit under test set in case there is no dominantly called production class
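A minimal sketch of the "most referenced class wins" rule, assuming the static call graph has already been flattened into a list of called class names. The function name and the input shape are hypothetical, not taken from the paper.

```python
from collections import Counter


def units_under_test_scg(called_classes, production_classes):
    """Return the production class(es) referenced most often by a test.

    called_classes: one entry per reference found in the test (assumed input).
    production_classes: names that count as production code.
    """
    counts = Counter(c for c in called_classes if c in production_classes)
    if not counts:
        return set()  # the strategy is not applicable to this test
    top = max(counts.values())
    # Ties are all returned, which is how helper and data object types
    # leak into the result when no class is dominantly called.
    return {cls for cls, n in counts.items() if n == top}
```

With references `["Account", "Money", "Account"]` the result is `{"Account"}`, while `["Account", "Money"]` returns both classes, illustrating the drawback quoted above.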

Last Call Before Assert (LCBA)

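The problem with SCG is that the many calls to helpers become noise; to filter these out:

looking at what happens right before the assert statement

This breaks down when a single test contains multiple assertions

Later work points out that a test consists of multiple sub-scenarios, e.g. Automatically identifying focal methods under test in unit test cases (2015)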
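The LCBA idea can be sketched over a recorded run-time trace. The trace format below (a list of `(kind, name)` pairs) is an assumption made for illustration; the paper's actual tracing setup is not described in these notes.

```python
def last_calls_before_assert(trace, production_classes):
    """Return, for each assert in the trace, the last production call before it.

    trace: list of (kind, name) events in execution order, where kind is
    "call" (name looks like "Class.method") or "assert" (hypothetical format).
    """
    results = []
    last_call = None
    for kind, name in trace:
        if kind == "call":
            cls = name.split(".", 1)[0]
            if cls in production_classes:
                last_call = name  # remember the most recent production call
        elif kind == "assert" and last_call is not None:
            results.append(last_call)
    return results
```

A test with two asserts produces two entries, which shows why multiple assertions leave the single "unit under test" ambiguous.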

We decided to use dynamic analysis for the LCBA resolution strategy, as to better deal with (i) polymorphism; (ii) conditional logic in test cases; and (iii) abstractions such as separate verification mechanisms.

What does (iii) mean?

Lexical Analysis (LA)

Our assumption is that a test case and the corresponding unit under test contain very similar vocabulary.

Use LSI to compute the vocabulary similarity between test classes and production classes (i.e. ordinary classes), and link the ones that are similar
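A rough sketch of vocabulary matching. Note that this swaps LSI for plain bag-of-words cosine similarity (no dimensionality reduction), so it is a simpler stand-in rather than the technique used in the paper; the function names and the `threshold` parameter are assumptions.

```python
import math
import re
from collections import Counter


def tokens(source):
    """Split source text into lowercase word parts, breaking camelCase."""
    parts = []
    for word in re.findall(r"[A-Za-z]+", source):
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return Counter(p.lower() for p in parts)


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_by_vocabulary(test_source, production_sources, threshold=0.0):
    """Return (class name, score) pairs with score >= threshold, best first."""
    test_vec = tokens(test_source)
    scored = [(name, cosine(test_vec, tokens(src)))
              for name, src in production_sources.items()]
    return sorted((p for p in scored if p[1] >= threshold),
                  key=lambda p: p[1], reverse=True)
```

Lowering `threshold` links more test cases to some class (higher applicability) at the cost of weaker matches, which is the trade-off raised in the discussion of this strategy.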

Co-Evolution (Co-Ev)

Test cases and their corresponding unit under test, we reason, ought to change together throughout time, as a change to the unit under test requires some modiﬁcations to the test case as well.

Mine the VCS history and treat classes and tests that were changed together as traceability links

Risk: frequently updated classes get misidentified as the unit under test

With this approach, we risk to wrongly identify production files that change very frequently as the unit under test of a variety of test cases.
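A minimal sketch of co-change mining, assuming the VCS log has already been reduced to one list of changed file names per commit. The `is_test` predicate and the file names in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def co_evolution_links(commits, is_test):
    """Map each test file to the production file it most often changed with.

    commits: list of commits, each a list of changed file names (assumed input).
    is_test: predicate telling test files from production files.
    """
    co_changes = defaultdict(Counter)
    for changed_files in commits:
        tests = [f for f in changed_files if is_test(f)]
        production = [f for f in changed_files if not is_test(f)]
        for test_file in tests:
            co_changes[test_file].update(production)
    # Pick the single most frequent co-changed production file per test.
    return {t: counts.most_common(1)[0][0]
            for t, counts in co_changes.items() if counts}
```

If a hypothetical `Util.java` is touched in almost every commit, it can outvote the real unit under test for some test file, which is exactly the risk described above.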

hr.icon

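How was it validated?

As the test oracle, humans annotated each test with its class under test

Each method was applied and three metrics were computed

Applicability: the proportion of test cases for which a retrieved UUT (unit under test) was found

Precision: the proportion of retrieved UUTs that actually match the test oracle

Recall: the proportion of the test-to-code traceability links in the test oracle that were actually retrieved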
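The three metrics can be sketched as below. This is a link-level reading of the definitions in these notes; the paper's exact formulas (for instance, whether scores are averaged per test case) may differ, and the function name and data shapes are assumptions.

```python
def evaluate(retrieved, oracle):
    """Return (applicability, precision, recall) for one strategy.

    retrieved: dict mapping test case -> set of retrieved units under test.
    oracle: dict mapping test case -> set of human-annotated units under test.
    """
    applicable = [t for t in oracle if retrieved.get(t)]
    applicability = len(applicable) / len(oracle) if oracle else 0.0

    retrieved_links = {(t, u) for t in oracle for u in retrieved.get(t, set())}
    oracle_links = {(t, u) for t, units in oracle.items() for u in units}
    correct = retrieved_links & oracle_links

    precision = len(correct) / len(retrieved_links) if retrieved_links else 0.0
    recall = len(correct) / len(oracle_links) if oracle_links else 0.0
    return applicability, precision, recall
```

A strategy that answers for few test cases can therefore still score high precision, which is why applicability is reported separately.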

https://scrapbox.io/files/60d770b380f183001c55352b.png

hr.icon

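Any discussion?

The projects used in this experiment followed the naming convention very strictly, so NC reached a solid 100% on both precision and recall, but in general a naming-convention approach should not work this well

Lexical analysis may have used too low a similarity threshold, which would explain the high applicability but low precision and recall? (I would like to see this rerun)

Co-Ev takes the class most often updated together with the test as the class under test, so precision and the like may be mediocre; there is room to refine how the co-changes are aggregated

In addition, combinations of multiple algorithms were tried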

(i) applying a strategy on test cases the previous strategy was not applicable to; and (ii) replacing the result of a previous strategy for a test case with that of the additional strategy when it improves the accuracy.

(ii) is really unsatisfying: it only works in an experimental setting, because in practice there is no test oracle.
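Variant (i) of the combination needs no oracle and can be sketched as a simple fallback chain. The function name is hypothetical, and each strategy is assumed to be a callable returning a (possibly empty) set of units under test.

```python
def combine_fallback(strategies, test_case):
    """Apply strategies in order and return the first non-empty result.

    A later strategy is consulted only for test cases that the earlier
    ones were not applicable to.
    """
    for strategy in strategies:
        result = strategy(test_case)
        if result:
            return result
    return set()  # no strategy was applicable
```

For example, a naming-convention strategy could be tried first, with a call-graph strategy as the fallback for tests that break the convention.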

hr.icon

Which papers to read next?

Unit tests as api usage examples (2010)

Scotch: Improving test-to-code traceability using slicing and conceptual coupling (2011)

EzUnit: A framework for associating failed unit tests with potential programming errors